---
title: Speech-to-Speech Capabilities in Kastrax | Kastrax Docs
description: Overview of speech-to-speech capabilities in Kastrax, including real-time interactions and event-driven architecture.
---

# Speech-to-Speech Capabilities in Kastrax ✅

## Introduction ✅

Speech-to-Speech (STS) in Kastrax provides a standardized interface for real-time voice interactions across multiple providers. It enables continuous, bidirectional audio communication by listening to events emitted by realtime models. Unlike separate text-to-speech (TTS) and speech-to-text (STT) calls, STS keeps a single connection open and processes speech in both directions as it arrives.

## Configuration ✅

The `OpenAIRealtimeVoice` constructor accepts the following options:

- **`chatModel`**: Configuration for the realtime model.
  - **`apiKey`**: Your OpenAI API key. Falls back to the `OPENAI_API_KEY` environment variable when omitted.
  - **`model`**: The model ID to use for real-time voice interactions (e.g., `gpt-4o-mini-realtime`).
  - **`options`**: Additional options for the realtime client, such as session configuration.
- **`speaker`**: The default voice ID used for speech output.

```typescript
import { OpenAIRealtimeVoice } from "@kastrax/voice-openai-realtime";

const voice = new OpenAIRealtimeVoice({
  chatModel: {
    apiKey: 'your-openai-api-key', // Omit to fall back to OPENAI_API_KEY
    model: 'gpt-4o-mini-realtime',
    options: {
      sessionConfig: {
        turn_detection: {
          type: 'server_vad', // Server-side voice activity detection
          threshold: 0.6,
          silence_duration_ms: 1200,
        },
      },
    },
  },
  speaker: 'alloy', // Default voice
});

// With default settings, the configuration can be omitted entirely:
const defaultVoice = new OpenAIRealtimeVoice();
```
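The `apiKey` fallback described in the option list can be pictured as a simple lookup order: an explicitly passed key wins, otherwise the `OPENAI_API_KEY` environment variable is used. The sketch below is only an illustration of that order. `resolveApiKey` and the `Env` type are hypothetical names, not part of the Kastrax API, and the real client may handle a missing key differently.

```typescript
// Hypothetical helper (not a Kastrax export): illustrates the
// "explicit key first, then OPENAI_API_KEY" lookup order.
type Env = Record<string, string | undefined>;

function resolveApiKey(explicitKey: string | undefined, env: Env): string {
  // An explicitly configured key takes precedence over the environment.
  const key = explicitKey || env["OPENAI_API_KEY"];
  if (!key) {
    // Assumption: failing early is preferable to a rejected connection later.
    throw new Error("No API key: set chatModel.apiKey or OPENAI_API_KEY");
  }
  return key;
}
```

In a Node.js application you would typically pass `process.env` as the second argument. Relying on the environment variable keeps the key out of source control.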

## Using STS ✅

```typescript
import { openai } from "@ai-sdk/openai"; // Provides `openai('gpt-4o')` used below
import { Agent } from "@kastrax/core/agent";
import { OpenAIRealtimeVoice } from "@kastrax/voice-openai-realtime";
import { playAudio, getMicrophoneStream } from "@kastrax/node-audio";

const agent = new Agent({
  name: 'Agent',
  instructions: `You are a helpful assistant with real-time voice capabilities.`,
  model: openai('gpt-4o'),
  voice: new OpenAIRealtimeVoice(),
});

// Connect to the voice service
await agent.voice.connect();

// Listen for agent audio responses
agent.voice.on('speaker', ({ audio }) => {
  playAudio(audio);
});

// Initiate the conversation
await agent.voice.speak('How can I help you today?');

// Send continuous audio from the microphone
const micStream = getMicrophoneStream();
await agent.voice.send(micStream);
```
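The `on('speaker', ...)` call above follows an event-driven pattern: handlers are registered once, and the provider then invokes them each time a chunk of audio arrives over the open connection. The following self-contained sketch mimics that flow with a stand-in class so the pattern can be tried without a network connection. `MockRealtimeVoice` and `emitAudio` are hypothetical, and the `Uint8Array` payload type is an assumption; this is not how Kastrax implements its voice providers.

```typescript
// Stand-in for a realtime voice provider; NOT the Kastrax implementation.
type SpeakerPayload = { audio: Uint8Array }; // Payload shape is an assumption
type SpeakerHandler = (payload: SpeakerPayload) => void;

class MockRealtimeVoice {
  private speakerHandlers: SpeakerHandler[] = [];

  // Register a handler for audio produced by the model.
  on(event: "speaker", handler: SpeakerHandler): void {
    if (event === "speaker") this.speakerHandlers.push(handler);
  }

  // Simulates the provider pushing one chunk of synthesized audio.
  emitAudio(chunk: Uint8Array): void {
    for (const handler of this.speakerHandlers) handler({ audio: chunk });
  }
}

const mock = new MockRealtimeVoice();
const received: number[] = [];

// Same handler shape as in the agent example: destructure `audio`.
mock.on("speaker", ({ audio }) => {
  received.push(audio.length);
});

mock.emitAudio(new Uint8Array(320));
mock.emitAudio(new Uint8Array(160));
// `received` now holds the chunk sizes in arrival order: 320, then 160.
```

Because handlers run once per chunk, playback code such as `playAudio` should be able to cope with many small buffers rather than one complete recording.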

For integrating Speech-to-Speech capabilities with agents, refer to the [Adding Voice to Agents](../agents/adding-voice.mdx) documentation.